DNA nano-mechanics: how proteins deform the double helix 
It is a standard exercise in mechanical engineering to infer the external forces and torques on
a body from a given static shape and known 
elastic properties. Here we apply this kind 
of analysis to distorted double-helical DNA in 
complexes with proteins: We extract the lo- 
cal mean forces and torques acting on each 
base-pair of bound DNA from high-resolution 
complex structures. Our analysis relies on 
known elastic potentials and a careful choice 
of coordinates for the well-established rigid 
base-pair model of DNA. The results are ro- 
bust with respect to parameter and confor- 
mation uncertainty. They reveal the com- 
plex nano-mechanical patterns of interaction 
between proteins and DNA. Being non-trivially 
and non-locally related to observed DNA con- 
formations, base-pair forces and torques pro- 
vide a new view on DNA-protein binding that 
complements structural analysis. 
A large class of DNA-binding proteins induce
deformations of the DNA double helix which
are essential in biochemical processes such as
transcription regulation, DNA packing and
replication [1]. Insight into the mechanism
of binding largely depends on high-resolution 
structures of DNA-protein complexes. A first 
step in their analysis consists of a description 
of DNA conformation in the complex, often in 
terms of a suitably reduced set of degrees of 
freedom such as the rigid base-pair parameters, 
e.g. [2]. As a second step, sites of local DNA deformation can be identified by comparison with
ensembles of fluctuating DNA conformations.
This makes it possible to quantify the deformation strength
in terms of a free energy. Here we take the
analysis a step further by extracting the points 
of attack, magnitudes and directions of forces 
acting between protein and DNA in the com- 
plex. 

The basic idea of inferring the force on an 
elastic body from its deformation is as com- 
monplace as stepping on a scale to measure 
one's weight. We propose to apply the same 
idea to DNA-protein complexes, using DNA 
as a nanoscale force probe calibrated by a 
known elastic potential. That is, starting 
from the coarse-grained mean conformation of
a piece of bound DNA as extracted from a high- 
resolution structural model, we infer the corre- 
sponding coarse-grained static mean forces re- 
quired for that conformation. To implement 
this idea, we use the rigid base-pair level of 
coarse-graining. Correspondingly, our analysis 
results in a DNA base-pair step elastic energy 
profile, complemented by the set of mean forces 
and torques by which the protein acts on each 
DNA base-pair. 

This article focuses on the theoretical basis, 
implementation, range of applicability and val- 
idation of DNA nano-mechanics analysis. We 
begin by discussing the statistical mechanics 
of the mechanical equilibrium in DNA-protein 
complexes in section Background. We also mo- 
tivate our choice of the rigid base-pair level of 
coarse-graining which, unlike standard molecular mechanics with atomistic force fields, allows
reliable extraction of mean forces within the ex- 
perimentally available resolution. Our matrix 
formalism for force and torque calculations is 
described in section DNA nano-mechanics, and 
implementation and parameter choice details 
are given in Methods. In the Results section, 
we present exemplary force and torque calcu- 
lations for several high-resolution NMR and x- 
ray complex structures. These examples show 
the robustness of the analysis with respect to 
experimental and parameter uncertainties, and 
demonstrate the key features of base-pair forces 
and torques described in the Discussion section: base-pair forces and DNA deformation
are non-trivially and non-locally related, and
they make it possible to discriminate between force-transmitting
and non-transmitting protein-DNA contacts.
Based on these features, DNA nano-mechanics 
analysis has a number of promising applica- 
tions, such as validation and design of coarse- 
grained molecular models for multi-scale sim- 
ulations, and identification of target sites for 
structure-changing mutations in protein-DNA 
complexes. These are expanded upon in the Conclusion section.

Figure 1: Constraint force $f_d$ and externally applied force $f$ in a stereotyped double-well free energy landscape, and thermal distribution (solid lines). Under an external force $f$, the landscape is tilted (dashed lines).

Background 

The statistical mechanics of DNA can be de- 
scribed on multiple levels of coarse-graining, 
depending on the required amount of detail. 
For a chosen set of reduced coordinates $\{x\}$,
the corresponding free energy $A(x)$ is a potential of mean force, i.e. constraining the system
to a conformation $x_d$ requires the mean force
$f_d = A'(x_d)$. This holds regardless of whether or
not $A$ is approximately quadratic; for instance,
$A$ could have a shape as in Fig. 1.

From high-resolution structures of protein-DNA complexes one obtains the mean conformation $x_d$ within some uncertainty $\delta_d$, and the
size $B^{1/2} = \langle (x - x_d)^2 \rangle^{1/2}$ of thermal fluctuations around it. If the force is approximately
linear over the range $B^{1/2}$, i.e. if

$$A''' \ll 2A'/B, \qquad (1)$$

then the mean force by which the environment
acts upon DNA to produce the observed conformation is given as $f_d = A'(x_d) \pm A''(x_d)\,\delta_d$.
This simple scheme fails for atomistic force
fields since they are strongly nonlinear on the
scale $B^{1/2} > 5$ Å, violating Ineq. (1). Mean atomistic
forces would have to be extracted from a full
MD simulation with adequately constrained
average (not instantaneous) positions. In contrast, mean forces acting on groups of atoms
may be meaningfully extracted from coarse-grained descriptions of a single structure, when
the smoother coarse-grained free energy is compatible with Ineq. (1).
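The scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quartic double-well `A`, its parameters `a` and `b`, and the `margin` used to turn the "much smaller than" of Ineq. (1) into a concrete test are hypothetical choices, not taken from the article.

```python
def dA(x, a=1.0, b=1.0):
    """First derivative of a hypothetical double well A(x) = b*(x**2 - a**2)**2."""
    return 4.0 * b * x * (x**2 - a**2)

def d2A(x, a=1.0, b=1.0):
    """Second derivative A''(x) of the same model free energy."""
    return 4.0 * b * (3.0 * x**2 - a**2)

def d3A(x, a=1.0, b=1.0):
    """Third derivative A'''(x) of the same model free energy."""
    return 24.0 * b * x

def mean_force(x_d, delta_d):
    """Mean force A'(x_d) and its uncertainty |A''(x_d)| * delta_d."""
    return dA(x_d), abs(d2A(x_d)) * delta_d

def force_is_linear(x_d, B, margin=0.1):
    """Ineq. (1), read as |A'''| <= margin * 2|A'|/B (margin is an assumption).

    B is the variance of the thermal fluctuations, so B**0.5 is their size.
    """
    return abs(d3A(x_d)) <= margin * 2.0 * abs(dA(x_d)) / B

# Conformation held at x_d = 1.2, slightly outside the well at x = 1
f_d, f_err = mean_force(1.2, delta_d=0.01)
print(f_d, f_err)
print(force_is_linear(1.2, B=1e-3), force_is_linear(1.2, B=1.0))
```

With small fluctuations the force is effectively linear and the single-structure estimate is meaningful; with fluctuations comparable to the well spacing the criterion fails, which is the situation described for atomistic force fields.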

Here we consider the rigid base-pair [3] 
model of DNA as a good compromise between 
resolution and reliability. The correspond- 
ing sequence-dependent free energies have been 
parametrized from microscopic data [U E] with 
considerable effort. They have been success- 
fully used in describing indirect readout [SI El El 
E] and match well with known μm-scale DNA
elastic properties [10]. The free energy functions, and therefore the extracted forces, are
reliable within some sufficiently sampled region 
around the ideal B-DNA ground state which 
excludes only the most extreme deformations, 
see Discussion. Notably, the sampled free en- 
ergy within this region is well approximated by 
a quadratic function H], leading to linear 
elastic forces and validity of Ineq. (1). While extensions of the free energy function into the anharmonic region and inclusion of trinucleotide
coupling could further improve the range
of validity and the accuracy of base-pair forces, the
present parametrization leads to consistent results, as shown below. Note that by choosing the rigid base-pair level of coarse-graining,
all information on force pairs that cancel on 
smaller scales, e.g. separation of bases, is disre- 
garded. The resulting description retains those 
forces that are relevant for large scale DNA de- 
formations. 

DNA nano-mechanics 

In the rigid base-pair model of DNA [3], base-pairs are represented as rigid bodies without internal structure, see Fig. 2. From the atomic coordinates of a base-pair $k$, a reference frame $g_k$ is derived in a standardized way [11] [12], which specifies the base-pair orientation $R_k$ and position in space $p_k$, both given relative to some fixed lab frame. The conformation of a piece of DNA is described by the chain $G = (g_1, g_2, \dots, g_n)$ of base-pair reference frames. The base-pair step conformations are denoted by $g_{k\,k+1}$, i.e. the orientation and position of $g_{k+1}$ relative to $g_k$. The data $R_k$ and $p_k$ may be represented in a number of different coordinate systems, including Euler angles, exponential coordinates, etc. Similarly, the relative step conformations $g_{k\,k+1}$ are conventionally discussed in terms of the six base-pair step parameters Tilt, Roll, Twist, Shift, Slide and Rise [TO]. At the moment, we avoid fixing a particular coordinate system, considering the frames $g_k$ as abstract elements of the rigid motion group.

Figure 2: Rigid base-pair model. Atomic coordinates are reduced to the base-pair center position and orientation degrees of freedom, represented as bricks. Conformations of the I-PpoI complex before (left) and after (right) pre-relaxation within a range of $r_m = 0.3$ Å, see Methods.

Unusual DNA conformations can be recognized by their low probabilities in an equilibrium ensemble of freely fluctuating DNA. The corresponding conformational elastic free energy $A_B(G)$ depends on the base sequence of the chain $B = b_1 b_2 \cdots b_n$. The free energy $A_B$ is a potential of mean force which quantifies the strength of deformation of the chain. We fix the zero-point so that $\min_G A_B = 0$ and call $A_B$ the elastic energy of the chain as is customary.

We argue that knowledge of the function $A_B$ can provide valuable information when it is used to derive mean forces (see footnote 1). The elastic generalized force by which the chain acts on its $k$-th base pair is given by the negative derivative of the chain energy with respect to the $k$-th base pair configuration, $-d_{g_k} A_B(G)$. In the static equilibrium of a DNA-protein complex, this elastic force exerted by DNA is balanced by the external force acting on base-pair $k$

$$\mu_{(k)} = d_{g_k} A_B(G). \qquad (2)$$

The basic force balance Eq. (2) allows external forces to be inferred from DNA conformations. This relation holds for general elastic energy functions, including next-nearest neighbor coupling [HI ESI QUI HZ]. However, the best presently available full parameter sets approximate the elastic energy as a sum of harmonic, nearest-neighbor base-pair step energies so that

$$A_B(G) = \sum_{k=1}^{n-1} a_{b_k b_{k+1}}(g_{k\,k+1}). \qquad (3)$$

For details on our particular choice, see Methods. In this case Eq. (2) specializes to

$$\mu_{(k)} = d_{g_k}\, a_{b_{k-1} b_k}(g_{k-1\,k}) + d_{g_k}\, a_{b_k b_{k+1}}(g_{k\,k+1}). \qquad (4)$$

One sees that the external force on a base-pair $g_k$ balances a sum of two terms, which are just the elastic tensions in the steps $k-1,k$ and $k,k+1$, respectively (see footnote 2). At each end of the chain, there is of course only one-sided tension.

Unlike the generalized force itself, the components $\mu_{(k)i}$ are defined with respect to a particular choice of basis. We pick a basis by requiring that the $\mu_{(k)i}$ have simple physical interpretations in terms of force and torque (see footnote 3). To formulate this idea, we remark that the frames $g_k$ are elements of the rigid motion group and can be represented as so-called homogeneous matrices, which are well-known in robotics, see [15]. This approach has been used for coarse-graining the rigid base-pair model [TU] and in the context of worm-like chain models of DNA [TS]. Here, base-pair frames are written explicitly as

$$g_k = \begin{pmatrix} R_k & p_k \\ 0 & 1 \end{pmatrix},$$

where $R_k$ is a $3 \times 3$ rotation matrix and $p_k$ is a $3 \times 1$ column vector. The base-pair step conformations are calculated as a matrix product $g_{k\,k+1} = g_k^{-1} g_{k+1}$ from the base-pair conformations, where

$$g_k^{-1} = \begin{pmatrix} R_k^{T} & -R_k^{T} p_k \\ 0 & 1 \end{pmatrix}.$$

The corresponding matrix generators $X_i$ for rotations ($1 \le i \le 3$) and translations ($4 \le i \le 6$) are $4 \times 4$ matrices with entries $(X_i)_{jk} = \epsilon_{jik} + \delta_{k4}\,\delta_{i-3,\,j}$, where $\epsilon_{jik}$ is taken to be zero whenever an index exceeds 3. Using this notation, the derivatives of $A_B$ with respect to infinitesimal motions of base-pair $k$,

$$\mu_{(k)i} = \left.\frac{d}{dh}\right|_{h=0} A_B\big(g_1, \dots, g_{k-1},\, g_k(1 + hX_i),\, g_{k+1}, \dots, g_n\big), \qquad (5)$$

have the required simple interpretations: $(\mu_{(k)i})_{1 \le i \le 3}$ are the Cartesian components of the external torque $t_{(k)}$ on base-pair $k$ about an axis through $p_k$, while $(\mu_{(k)i})_{4 \le i \le 6}$ are Cartesian components of the external force $f_{(k)}$ attacking at $p_k$. These components are relative to the base-pair fixed triad $R_k$. For actually calculating the components Eq. (5) it is convenient to rewrite the step energy $a_{bb'}$, which is usually given in terms of the rigid base-pair step parameters, in terms of exponential coordinates, see Supplementary Material, Text supp-1.

Footnote 1: By 'forces' we always mean mean forces in the following.

Footnote 2: Eq. (4) is the equivalent of the standard relation 'force = div stress' of continuum elasticity in the present context of a discrete, linear chain.

Footnote 3: This requirement precludes the use of Euler angles for the orientation $R_k$.
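The force and torque components of Eq. (5) can be evaluated numerically along the following lines. This is a minimal sketch, not the authors' Mathematica implementation: it assumes a sequence-independent harmonic step energy, uses the rotation vector of $R$ together with the translation $p$ of the step frame as simplified step coordinates (the article uses exponential coordinates of the full rigid motion), and replaces the derivative by a central finite difference. All numerical values (`q_eq`, `S`) are made up for illustration.

```python
import numpy as np

def generators():
    """Six 4x4 generators X_i: rotations (i = 0..2) and translations (i = 3..5)."""
    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k], eps[i, k, j] = 1.0, -1.0
    X = np.zeros((6, 4, 4))
    for i in range(3):
        X[i, :3, :3] = eps[:, i, :]   # (X_i)_{jk} = eps_{jik}
        X[i + 3, i, 3] = 1.0          # unit translation along body axis i
    return X

def exp_generator(X, h):
    """exp(h X) for a single generator (pure rotation or pure translation)."""
    if np.any(X[:3, :3]):             # rotation: Rodrigues formula, X^3 = -X
        return np.eye(4) + np.sin(h) * X + (1.0 - np.cos(h)) * (X @ X)
    return np.eye(4) + h * X          # translation: X^2 = 0

def step_coords(g):
    """Rotation vector of R and translation p of a step frame (simplified coordinates)."""
    R = g[:3, :3]
    w = 0.5 * np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    s, c = np.linalg.norm(w), 0.5 * (np.trace(R) - 1.0)
    if s > 1e-12:
        w = w * (np.arctan2(s, c) / s)  # rescale sin(theta)*axis to theta*axis
    return np.concatenate([w, g[:3, 3]])

def chain_energy(frames, q_eq, S):
    """Sum of harmonic step energies, the analogue of Eq. (3) with one step type."""
    energy = 0.0
    for g1, g2 in zip(frames[:-1], frames[1:]):
        dq = step_coords(np.linalg.inv(g1) @ g2) - q_eq
        energy += 0.5 * dq @ S @ dq
    return energy

def external_force(frames, k, q_eq, S, h=1e-6):
    """Torque and force on base-pair k in its own frame, via central differences."""
    X = generators()
    mu = np.zeros(6)
    for i in range(6):
        for sign in (1.0, -1.0):
            moved = list(frames)
            moved[k] = frames[k] @ exp_generator(X[i], sign * h)
            mu[i] += sign * chain_energy(moved, q_eq, S) / (2.0 * h)
    return mu[:3], mu[3:]

# Two base-pairs; the single step is overtwisted by 0.1 rad relative to q_eq.
X = generators()
q_eq = np.array([0.0, 0.0, 0.6, 0.0, 0.0, 3.4])   # hypothetical equilibrium step
S = np.diag([1.0, 1.0, 2.0, 1.0, 1.0, 1.0])       # hypothetical stiffness matrix
g1 = np.eye(4)
g2 = exp_generator(X[2], 0.7) @ exp_generator(X[5], 3.4)
torque_1, force_1 = external_force([g1, g2], 0, q_eq, S)
torque_2, force_2 = external_force([g1, g2], 1, q_eq, S)
print(torque_1, torque_2)
```

In this toy case the excess twist is held by an opposing torque pair about the helix axis and no forces, because the assumed stiffness matrix is diagonal; a stiffness matrix with twist-stretch coupling would add a force pair.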
Figure 3: Force and torque pairs acting on DNA to produce an excess of one base-pair step parameter in each row. Torque vectors $t_{(1)}, t_{(2)}$ shown in blue, force vectors $f_{(1)}, f_{(2)}$ in red. The same deformation of the central base-pair step can be produced by external force and torque pairs attacking directly (left column), at the nearest neighbor base-pairs (middle column), or seven base-pairs away (right column). Sequence-averaged MP parameters. For plots of the base-pair step parameters associated with these equilibrium shapes, see Fig. supp-1.

An overview of the relation between external forces and torques, and base-pair step deformations is given in Fig. 3. Consider first the left-hand column. The force and torque pairs required for increasing each of the six base-pair step parameters demonstrate the strong coupling between the different base-pair deformation modes of B-form DNA. E.g. to produce pure increased Twist, overwinding torque must be assisted by compressive forces as a consequence of the counter-intuitive twist-stretch coupling [20] [21] [22]. In addition, there exists a geometric coupling effect: force balance requires that force pairs sum to zero. However, they need not be collinear; any offset of their lines of attack generates an additional torque 'by leverage' which enters the torque balance. Thus the external torque vector pairs shown in Fig. 3 generally do not sum to zero.

A base-pair step can be deformed by external forces acting directly on its constituent base-pairs, but also indirectly, by external forces at distant base-pairs. Examples of this non-local effect are shown in the middle and right hand columns of Fig. 3. Here the central base-pair step of each chain has the same deformation as in the left hand column; however this time produced indirectly, by external forces applied only at the chain ends. Along the chain, tensions are non-zero but balanced so that DNA assumes a stressed equilibrium shape [23] in which all intermediate $\mu_{(k)}$ vanish. These shapes can exhibit strongly non-uniform deformation, e.g. non-uniform Twist, see Fig. supp-1. (For a related study of stress localization in RNA, see [21].) Given these complicated shapes, it is difficult to guess at external forces by structural inspection.
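The non-local effect can be illustrated with a one-dimensional analogue of Eq. (4). The sketch below is a toy model under stated assumptions: base-pairs are points on a line, each step is a harmonic spring, and the stiffness values are invented. It shows that forces applied only at the chain ends produce non-uniform step deformations while all interior external forces vanish.

```python
import numpy as np

def external_forces_1d(x, rest, stiff):
    """External force on each node of a harmonic chain, mu_k = dA/dx_k.

    A = sum_k 0.5 * stiff[k] * (x[k+1] - x[k] - rest[k])**2
    """
    tension = stiff * (np.diff(x) - rest)   # elastic tension in each step
    mu = np.zeros(len(x))
    mu[1:] += tension      # derivative of step (k-1, k) energy w.r.t. x_k
    mu[:-1] -= tension     # derivative of step (k, k+1) energy w.r.t. x_k
    return mu

rest = np.full(5, 3.4)                          # hypothetical rest lengths
stiff = np.array([1.0, 2.0, 4.0, 2.0, 1.0])     # hypothetical step stiffnesses
end_force = 0.4

# Stressed equilibrium shape: every step carries the same tension end_force.
x = np.concatenate([[0.0], np.cumsum(rest + end_force / stiff)])

mu = external_forces_1d(x, rest, stiff)
extension = np.diff(x) - rest
print(mu)          # non-zero only at the two chain ends
print(extension)   # non-uniform: soft steps stretch more than stiff ones
```

Reading the extensions alone, one might guess that forces act at the soft steps; the force profile shows that they act only at the ends.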

In the general case of DNA bound to pro- 
tein, external forces may act anywhere along 
the chain. Here each base-pair step deforma- 
tion is caused by a combination of local ex- 
ternal forces and internal propagated tension. 
In this article, instead of investigating equilibrium shapes for given boundary conditions, we
focus on the converse question of what local
external forces and torques are required for a
given general shape. These forces and torques
give a quantitative measure of how the protein forces DNA into that shape.



Methods 

Rigid base-pair parameter sets 

To parametrize the base-pair step energy $a_{bb'}$, step equilibrium conformations $g_{\mathrm{eq},bb'}$ and stiffness matrices $S_{bb'}$ are needed for each dinucleotide type $bb'$. We used a combination of knowledge-based and simulation-based parameters. Specifically, the relaxed conformation of each dinucleotide $bb'$ is the mean conformation reported for $bb'$ in a crystal structure database of protein-bound DNA fragments [5]. The stiffness matrix $S_{bb'}$ for each dinucleotide is the stiffness matrix reported for $bb'$ as a result of explicit-solvent, all-atom MD simulations of a library of oligonucleotides [3]. We have reported previously on the relation of this hybrid parameter set to pure knowledge-based or pure simulation-based parameters [7], and have shown that it reproduces known values of the μm-scale elasticity of DNA without fitting [10]. In these articles and here, the hybrid parameter set is denoted by 'MP'. To evaluate the robustness of our method, we have compared the results computed with MP to those computed with a pure knowledge-based parameter set 'P'. The P parameter set is essentially identical to the P-DNA parameter set from [5] constructed from mean values and covariance matrices of protein-bound DNA [5], the difference being a global multiplicative factor to correct the temperature scale, see [TJ. The dependence of our results on the choice of parameter set is discussed in Results.



Restrained Relaxation 

To account for the high, but limited precision of structural models, we included an initial pre-relaxation stage, a strategy which has also been used for atomistic force fields in related studies [53 IB]. Here, the rigid base-pair coordinates of the DNA fragment were allowed to relax simultaneously, descending the gradient of $A$; at the same time, each base-pair was restrained to a region around its original conformation by a sharply increasing potential. We set the size $r_m$ of this region on the order of the atomic position uncertainty. In this way elastic tension in DNA is allowed to relax, but only so little that the relaxed conformation remains consistent with the reported structural data. The parameter $r_m$ describes the assumed precision of the input data, and is not a property of the employed force field. To set $r_m$ we used the estimated coordinate error based on a Luzzati plot as reported in the PDB files if available (which is mostly around 15% of the reported x-ray resolution), and 15% of the resolution otherwise. This gave a range of $r_m = 0.28 \ldots 0.33$ Å in the shown examples. The effect of this relaxation on base-pair frame conformations is barely visible by eye, see Fig. 2.

The pre-relaxation procedure can be seen as 
a form of data smoothing with a tendency to 
equalize elastic tension in DNA. Force analysis 
after restrained relaxation produces the set of 
smallest external forces which are compatible 
with the confidence region of the structural in- 
put data. In practice, the effect of relaxation 
was to reduce extreme peaks in the resulting 
energy and force profiles, while relaxing the 
weakest external forces to zero. Thus, relax- 
ation reduces extreme force outliers and elimi- 
nates low-level random noise. 
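A one-dimensional toy version of restrained relaxation is sketched below. It is an illustration under assumptions, not the procedure used for the figures: the article only states that the restraint is a sharply increasing potential, so the specific form (displacement over $r_m$ raised to the 12th power), the plain gradient descent, and all numbers are hypothetical.

```python
import numpy as np

def elastic_energy(x, rest, stiff):
    """Harmonic chain energy, a 1D stand-in for the elastic energy A."""
    return 0.5 * np.sum(stiff * (np.diff(x) - rest) ** 2)

def elastic_gradient(x, rest, stiff):
    """Gradient of the chain energy with respect to the node positions."""
    tension = stiff * (np.diff(x) - rest)
    grad = np.zeros_like(x)
    grad[1:] += tension
    grad[:-1] -= tension
    return grad

def restrained_relaxation(x0, rest, stiff, r_m, power=12, step=5e-4, n_iter=10000):
    """Descend the gradient of A plus the restraint sum(((x - x0) / r_m)**power)."""
    x = x0.copy()
    for _ in range(n_iter):
        d = (x - x0) / r_m
        restraint_grad = power * d ** (power - 1) / r_m
        x -= step * (elastic_gradient(x, rest, stiff) + restraint_grad)
    return x

rest = np.full(5, 3.4)
stiff = np.full(5, 5.0)
x0 = np.array([0.0, 3.4, 6.8, 11.0, 14.4, 17.8])   # middle step stretched by 0.8
r_m = 0.3

x_relaxed = restrained_relaxation(x0, rest, stiff, r_m)
print(elastic_energy(x0, rest, stiff), elastic_energy(x_relaxed, rest, stiff))
print(np.max(np.abs(x_relaxed - x0)))   # stays below r_m
```

The restraint is nearly flat for displacements well inside $r_m$, so tension spreads over neighbouring steps and the sharpest energy peak is reduced, while no node moves further than the assumed precision of the input.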

Although any sensible choice of $r_m$ should roughly equal the structural uncertainty, its exact value remains undetermined. The global scale of our computed forces and torques depends on the value of $r_m$, but their relative magnitudes along the chain and their directions are only weakly affected by this choice. After relaxation, we also found substantially increased agreement between energies and forces computed using different elastic parameter sets. These features of pre-relaxation are illustrated by Fig. supp-2.



Implementation and Visualization 

Forces and torques in this article were computed starting from the following Protein Data Bank high-resolution structures: 1nvp, 1cdw, 1qne for TBP, 1l1m for lac repressor and 1cz0 for I-PpoI. Rigid base-pair frames were computed from atomic coordinates by least-squares fitting of model base-pairs following a standardized procedure, using the 3DNA program [11] [12]. Calculation of energies, forces and torques as described in section DNA nano-mechanics, as well as pre-relaxation, were implemented in Mathematica. Three-dimensional vector depictions of base-pair conformations, forces and torques were exported as VRML files, which are available as Supplementary Material Data S1-S3. They can be visualized and superimposed with atomic structure data using molecular visualization software, for instance the free molecular visualization program Chimera [27], which was used for the images in this article.

Results 

We have applied an analysis as described above to three x-ray co-crystal structures of TATA-box binding protein [28] [29] [30] and an ensemble of 20 NMR solution structure conformers of a lac repressor complex [31]. As an example of a trapped intermediate state of a structural modification of DNA, we analyzed a co-crystal structure of the homing endonuclease I-PpoI.

Fig. 4 illustrates the force analysis of TATA-box binding protein (TBP) complexed with cognate DNA. In this complex, TBP bends DNA into the major groove. The overall turn of about 80° is distributed over the eight base-pairs of the TATA-box, whose steps have uniform positive roll. The highest deformation energy occurs at the first 'TA' dinucleotide step whose base-pairs (9 and 10) are separated but not strongly kinked. A secondary peak in elastic energy can be seen at base-pair step 15-16. Inspection of the structure shows that at both of these locations, a phenylalanine residue partially intercalates between the base-pairs. The initial, straight poly-G DNA region shows only small deformation energies.

Superimposed on the TFIIA-TBP-DNA crystal structure [30], Fig. 4 shows force and torque vectors from an analysis of three different co-crystals of TBP with the same DNA binding site. A strong opposing force pair is seen to pull apart base-pairs 9 and 10. Along the box, the 80° turn is associated with a nicely aligned sequence of torque vectors at base-pairs 10 to 15; they deviate by at most 28° from pointing into the major groove. Unlike the rather evenly distributed torques, the base-pair forces have a minimum in the center of the box, and a second peak associated with a force pair stretching the base-pair step 15-16. Note that the directions of forces and torques at base-pairs 9-10 and 15-16 are approximately related by a two-fold symmetry around step 12-13 corresponding to the symmetry of TBP; however their magnitudes are about half at base-pairs 12-13. Base-pairs 1-8 are present in only one of the crystal structures, and exhibit low force and torque magnitudes.

Fig. 5 shows the nano-mechanics analysis of an NMR solution structure ensemble of E. coli lac repressor bound to DNA. In this complex, a wild-type operator with non-palindromic sequence is bound by a homodimer of the lac repressor DNA binding domain [21]. The resulting complex structure is only approximately two-fold symmetric. The strongest deformations occur around the central six base-pairs 9-14, producing the overall 30° bend of DNA. The kink at the symmetry center 11-12 has by far the highest elastic energy.

Figure 4: TFIIA(not shown)-TBP-DNA complex with force and torque vectors from three different TBP-DNA crystal structures, left. Corresponding energy, force magnitude and torque magnitude profiles, right. Here and in the following figures, linear forces $f_{(k)}$ are shown as red arrows, torques $t_{(k)}$ as blue arrows; base-pairs are represented as numbered small boxes with sequence coloring, 'A' red, 'T' blue, 'G' green, 'C' yellow; the two viewpoints are rotated by 90° around the vertical axis. Sequence (5', base-pair 1)-GGGGGGGCTATAAAAGG-(3', base-pair 17). Allowed relaxation range $r_m = 0.3$ Å in all complexes. MP parameter set. The three-dimensional representations of base-pairs, force and torque vectors used for this figure are available, as detailed in Methods (Supplementary Material, Data S1).

As in the TBP case, the peak of elastic 
energy is associated with a pair of base-pair 
stretching forces, here accompanied by a strong 
unwinding torque pair. In the central region 
9-14, force and torque directions closely fol- 
low the approximate two-fold symmetry of the 
complex, despite the fact that the sequence is 
asymmetric. Outside the central region, force 
and torque symmetry is broken. Secondary 
peaks in torque and force magnitude can be
identified at the symmetry-related base-pair
steps 6-7 and 16-17. At base-pairs 6 and 7,
a rather weak pair of shearing forces attacks,
while base-pairs 16 and 17 are pulled apart by 
a strong stretching force pair. Also, a strong 
torque on base-pair 19 has no counterpart at 
the symmetry-related position 4, highlighting 
the different binding modes of the two half- 
sites. 

Fig. 6 illustrates the analysis of the homing endonuclease I-PpoI, bound to target DNA substrate in an un-cut state. This complex has a palindromic operator sequence and an overall two-fold symmetry. Cleavage occurs at step 8-9 (and the symmetry-related 12-13) in the active form of the complex. In contrast to lac, deformation of the operator occurs mainly not at the symmetry center 10-11 but within the triplet 7-8-9 (12-13-14). The intervening base-pair steps are sheared and tilted, producing the overall 45° bend of the binding site.

Figure 5: Complex of lac repressor and DNA. NMR structure of one out of 20 conformers with ensemble mean force and torque vectors, left. Two magnified views of the encircled region, middle column, with force and torque vectors of all conformers. Ensemble energy, torque and force magnitude profiles, right; the ensemble standard deviation profiles of the force and torque vectors over the NMR conformers are shown in black. MP parameter set. Sequence (5', base-pair 1)-GAATTGTGAGCGGATAACAATTT-(3', base-pair 23). Three-dimensional representations in Data S2.

The computed external forces show that these deformations are mainly due to a pair of strong opposing forces attacking at base-pairs 7 and 9 (12 and 14), stretching and shearing the triplets. In addition, there is an external torque attacking at the intermediate base-pair 8 (13). While the adjacent base-pair step 9-10 (11-12) is almost completely relaxed, the central step is sheared by an opposing lateral force pair.



Discussion 

We point out the main general features of an 
analysis of DNA-protein complex structures in 
terms of base-pair forces and torques, using the 
structures presented in the Results section as 
examples. 

Robustness of nano-mechanics analysis

The conformational data on which our analysis is based, as well as the elastic parameter sets, are reliable within certain bounds of error. How do these sources of uncertainty influence the derived external forces and torques?

Figure 6: I-PpoI DNA complex. The points of single-strand cuts in the functional complex are indicated. Relaxation range $r_m = 0.3$ Å. The MP parameter set was used for vectors, and MP (solid line) and P (dashed line) parameter sets for profiles, right. Sequence (5', base-pair 1)-TGACTCTCTT-AAGAGAGTCA-(3', base-pair 20). Three-dimensional representations in Data S3.

To assess the dependence on details of crys- 
tallization, we computed forces and torques 
based on three x-ray structures of the TBP 
complex, see Fig. 4. Two of the complexes
lack TFIIA and all three crystals have dif- 
ferent space groups and thus different crys- 
tal contacts. Nonetheless, their energy, force 
and torque profiles quantitatively agree at 
common base-pairs. As can be seen from 
the three-dimensional representation, also force 
and torque directions agree closely. 

The conformational variability within an 
NMR structure ensemble leads to variability 
of elastic energies, external forces and torques. 
These were computed for a lowest-energy en- 
semble of 20 conformers of the lac repressor 
solution structure [31], see Fig. 5. We find that
the main features of the corresponding profiles
are clearly more pronounced than the variation
across the ensemble. Also forces and torque di- 
rections are robust among conformers, with the 
exception of the most weakly forced base-pairs. 

When comparing computed forces and 
torques corresponding to the different param- 
eter sets P and MP, we find surprisingly good 
agreement (Fig. 6), considering the completely
different sources (crystal structure database for 
P, MD simulation for MP, see Methods) of the 
stiffness parameters. 

The choice of pre-relaxation range $r_m$ reflects
the assumed precision of the structural input.
It does not strongly affect the relative magnitudes of local forces and torques. However, a
present limitation of the method comes from
the fact that overall force and torque scales
vary with $r_m$. Setting $r_m$ to the structural
precision leads to the lowest external forces
that are compatible with the considered structural model. While this choice is reasonable,
it could be improved upon by incorporating 
information on the equilibrium fluctuations of 
bound base-pairs derived from the local B- 
factors. Clearly, a direct comparison to av- 
eraged forces from full simulations of protein- 
DNA complexes would be enlightening. Note 
however that compared to the construction of 
hybrid base-pair potentials [7], the correction of 
known artifacts such as systematic undertwist 
is less straightforward for atomic force fields. 

Force scale and validity range 

The characteristic scales of base-pair forces and 
torques in thermal equilibrium at room tem- 
perature are determined by equipartition of en- 
ergy. They result as 245 pN and 130 pN nm, re- 
spectively. (For base-pair step tensions one ob- 
tains 170 pN and 90 pN nm.) Thus in a thermal 
equilibrium ensemble, the instantaneous forces 
of a harmonic base-pair step are normally dis- 
tributed with width 245 pN, so that on average 
10 % of instantaneous forces are higher than 
400 pN. We conclude that our elastic poten- 
tials are well supported by MD simulation up 
to around 400 pN and 200 pN nm. 
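The quoted fraction can be checked with a short calculation. The sketch assumes that "higher than 400 pN" refers to the magnitude of a single normally distributed force component with zero mean and standard deviation 245 pN, i.e. a two-sided Gaussian tail; this reading is an assumption and is not spelled out in the text.

```python
import math

def fraction_above(threshold, width):
    """Two-sided tail P(|f| > threshold) for a zero-mean normal with std `width`."""
    return math.erfc(threshold / (width * math.sqrt(2.0)))

# Forces: width 245 pN, threshold 400 pN (values quoted in the text)
p_force = fraction_above(400.0, 245.0)
# Torques: width 130 pN nm, threshold 200 pN nm (values quoted in the text)
p_torque = fraction_above(200.0, 130.0)
print(round(p_force, 3), round(p_torque, 3))
```

Under this assumption roughly one instantaneous force in ten exceeds 400 pN, consistent with the statement above; the corresponding fraction for torques above 200 pN nm comes out slightly higher.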

In our examples, only a few of the highest 
force peaks exceed that range. This can be 
expected also for most other complexes: In the 
DNA-protein crystal structure database [5], the 
bulk of DNA base-pair steps was slightly less 
deformed [7] than in the thermal ensemble [3].
Note however that outlying base-pair steps do 
occur in the crystal database; they are pref- 
erentially deformed into the softest directions 
[5]. Together with the observation of unex- 
pectedly frequent sharp bending of DNA [33] 
this suggests that the true free energy func- 
tion stays below the harmonic approximation 
for strong deformations, cf. Fig. 1. Thus forces
and torques tend to be overestimated outside 
the validity range given above. 

Our force field is parametrized by MD simulations at room temperature, while the crystal
structures are typically observed at ~ 110 K. 
The computed forces are thus predictions for 
the complex at room temperature under the 
assumption that the mean conformation is es- 
sentially the same as at 110 K. This assump- 
tion is ubiquitous in structural biology when 
interpreting crystal structures in terms of bio- 
logical function. On the other hand, we cannot 
assume that the mean conformation from the 
crystal, or the force field, are valid at temper- 
atures that approach the melting temperature 
of DNA. Not only will the increased thermal
fluctuations exceed the harmonic range of the potential described above; the stiffness parameters themselves also change with temperature
due to the entropic part contained in the elastic 
free energy. Finally, the basic requirement of 
the rigid base-pair model that internal degrees 
of freedom are uncoupled between neighboring 
base-pairs is violated by cooperative base-pair 
opening. As a result, it is hard to estimate 
the temperature range of validity of the present 
force calculation. A very loose upper bound is 
~ 330 K where local bubble formation starts. 

The limitations listed above arise from the 
presently available free energy functions; the 
general procedure of force and torque extrac- 
tion is unchanged for general anharmonic free 
energies, or free energies that include inter-base 
degrees of freedom. 
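
To illustrate that the extraction step itself does not depend on harmonicity, the following one-dimensional sketch obtains the mean force as the numerical derivative of an arbitrary free energy function. The "softened" potential is a purely hypothetical stand-in for a free energy that stays below the harmonic approximation at strong deformation; its functional form and the constants K and XS are assumptions chosen for illustration, not fitted to any DNA data.

```python
import math

K = 10.0    # illustrative stiffness, k_B T / Å^2 (assumed value)
XS = 0.5    # illustrative crossover deformation, Å (assumed value)

def harmonic(x):
    """Harmonic free energy 0.5*K*x^2."""
    return 0.5 * K * x * x

def softened(x):
    """Toy anharmonic free energy: harmonic for |x| << XS, linear for |x| >> XS."""
    return K * XS * XS * (math.sqrt(1.0 + (x / XS) ** 2) - 1.0)

def mean_force(free_energy, x, h=1e-5):
    """Mean force needed to hold the coordinate at x: central-difference dA/dx."""
    return (free_energy(x + h) - free_energy(x - h)) / (2.0 * h)

for x in (0.05, 0.3, 1.0):
    fh = mean_force(harmonic, x)
    fs = mean_force(softened, x)
    print(f"x = {x:4.2f} Å: harmonic {fh:6.3f}, softened {fs:6.3f} k_B T/Å")
```

In this toy model the two forces agree for small deformations, while the harmonic estimate exceeds the softened one for large deformations, which is the sense in which forces tend to be overestimated outside the validity range.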

Protein forces may exceed critical 
forces for DNA structural transitions 

Even though within thermal range, forces of
hundreds of pN may appear unreasonable given
that typical critical forces and torques for
disrupting B-DNA structure in single-molecule
experiments are only $f_c \simeq 65$ pN and
$t_c \simeq 40$ pN nm [31], and that unzipping
occurs already at ~ 15 pN. To see that there is
in fact no contradiction, note the qualitative
difference between protein forces acting locally
on B-DNA, and externally applied micromanipulation
forces. Consider again the stereotypical
double-well free energy in Fig. [1], where
the potential wells at $x_1$ and $x_2$ could for
example correspond to B-DNA and to an overstretched
state (S-DNA), respectively. Without
external force, the ground state is $x_1$. A
protein which binds DNA constrains the base-pair
step to a position $x_d$ close to $x_1$. The
required mean force is $f_d = A'(x_d)$; it is
entirely determined by the local potential well.
On the other hand, an external force $f$ pulling
on the DNA fragment acts by tilting the free
energy landscape, such that eventually $x_2$
becomes the ground state. The critical external
force $f_c = \Delta A/\Delta x$ for the transition
is determined by the free energy difference between the
potential wells and their separation, independent
of local well steepness. Thus for steep but
nearly degenerate and well separated potential
minima, one gets $f_d > f_c$ for moderate
displacements $x_d - x_1$. Entropic effects have been
neglected; they add minor corrections.
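
The argument can be made concrete with a toy double well built from two parabolas of equal stiffness. The sketch below is an illustration under stated assumptions, not part of the analysis: the well positions, the stiffness, the free energy offset and the function names are all hypothetical values chosen so that the wells are steep, nearly degenerate and well separated. It finds the tilt at which the two tilted minima become degenerate and compares it with the local holding force.

```python
def tilted_well_min(k, x0, a0, f):
    """Minimum over x of a0 + 0.5*k*(x - x0)**2 - f*x (attained at x = x0 + f/k)."""
    return a0 - f * x0 - f * f / (2.0 * k)

def critical_force(k, x1, x2, dA, lo=0.0, hi=100.0, tol=1e-10):
    """Bisect for the external force at which both tilted wells are equally deep."""
    def gap(f):
        # depth of well 2 minus depth of well 1; decreases with f when x2 > x1
        return tilted_well_min(k, x2, dA, f) - tilted_well_min(k, x1, 0.0, f)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters (units: k_B T and Å)
k = 10.0            # well stiffness
x1, x2 = 0.0, 2.0   # well positions
dA = 3.0            # free energy offset of the second well
x_d = 0.33          # position enforced by the protein

f_c = critical_force(k, x1, x2, dA)   # global tilting force, equals dA/(x2 - x1)
f_d = k * (x_d - x1)                  # local holding force A'(x_d)

print(f"f_c = {f_c:.3f} k_B T/Å, f_d = {f_d:.3f} k_B T/Å")
```

With these numbers the local force exceeds the critical force by about a factor of two, even though the displacement is small compared to the well separation.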

In the example of the overstretching transition,
the critical force is $f_c \simeq 65$ pN [53].
The stiffness and thermal standard deviation
of base-pair elongation are $k_i \simeq 10\,k_BT/$Å$^2$
and 0.33 Å, respectively [10]. Thus, stretching
the base-pair step by one standard deviation
already requires twice the critical force (and
$4 f_c$ if the next step is compressed by the same
amount).
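
The arithmetic behind this estimate can be spelled out as follows. The sketch assumes T = 300 K for "room temperature" and uses the values quoted above; the variable names are ours.

```python
K_B = 1.380649e-23                  # Boltzmann constant, J/K
T_ROOM = 300.0                      # assumed room temperature, K
KBT_PN_NM = K_B * T_ROOM / 1e-21    # thermal energy in pN nm (1 pN nm = 1e-21 J)

stiffness = 10.0    # elongation stiffness, k_B T / Å^2 (quoted value)
sigma = 0.33        # thermal standard deviation of elongation, Å (quoted value)
f_critical = 65.0   # overstretching force, pN (quoted value)

def to_pn(force_kbt_per_angstrom):
    """Convert a force from k_B T/Å to pN (1 Å = 0.1 nm)."""
    return force_kbt_per_angstrom * KBT_PN_NM / 0.1

f_one_sigma = to_pn(stiffness * sigma)         # tension of a step stretched by one sigma
ratio_single = f_one_sigma / f_critical        # one stretched step
ratio_pair = 2.0 * f_one_sigma / f_critical    # neighbouring step compressed equally
sigma_from_k = (1.0 / stiffness) ** 0.5        # equipartition check of sigma, Å

print(f"k_B T = {KBT_PN_NM:.2f} pN nm")
print(f"one-sigma force = {f_one_sigma:.0f} pN = {ratio_single:.1f} f_c")
print(f"with compressed neighbour: {ratio_pair:.1f} f_c")
print(f"sigma from stiffness: {sigma_from_k:.2f} Å")
```

The factor of two for the compressed neighbour reflects that the force on a base-pair is the difference of the tensions in its two adjacent steps.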

Sharp elastic energy peaks result from 
balanced pairs of force and torque 

The elastic energy gives a scalar measure of the 
overall deviation of each base-pair step from its 
equilibrium conformation. Elastic energy pro- 
files can therefore quickly show the hot spots 
of local deformation of complexed DNA. A recurring
motif in the analyzed profiles is an isolated
high energy peak flanked by low energy
steps. The energy peak motif generally indicates
a force and torque pair deforming the
high-energy step, which approximately satisfies
a local force and torque balance. An example
is provided by base-pair step 11-12 in the lac
repressor (Fig. [5]). The three-dimensional force
and torque vectors show that this step is kinked
by opposing pairs of force and torque with
stretching, shearing and underwinding components.
Further examples with less complete
balance are presented by the stretching force pair
at base-pairs 9 and 10 in TBP (Fig. [4]) and by
the more weakly deformed, symmetry-related
step 15-16.
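
The link between an isolated energy peak and a balanced force pair is easiest to see in a one-dimensional caricature of the rigid base-pair chain, where each base-pair has a single coordinate and each step is a harmonic spring. The sketch below is such a caricature under assumed values (rise 3.4 Å, stiffness 10 k_B T/Å^2, a single step stretched by 0.5 Å); it is not the six-dimensional calculation used for the figures.

```python
def step_analysis(x, rise=3.4, k=10.0):
    """1D toy chain: x[i] is the position of base-pair i along the axis (Å).

    Returns step tensions t, step energies e and the external forces f that
    must act on each base-pair to hold the chain in the given shape:
    f[i] = t[i-1] - t[i], with zero tension beyond the chain ends.
    """
    q = [x[i + 1] - x[i] - rise for i in range(len(x) - 1)]   # excess elongation
    t = [k * qi for qi in q]                                  # step tensions
    e = [0.5 * k * qi * qi for qi in q]                       # step energies
    f = []
    for i in range(len(x)):
        left = t[i - 1] if i > 0 else 0.0
        right = t[i] if i < len(t) else 0.0
        f.append(left - right)
    return t, e, f

# Six base-pairs; only the step between indices 2 and 3 is stretched by 0.5 Å
x = [0.0, 3.4, 6.8, 10.7, 14.1, 17.5]
tensions, energies, forces = step_analysis(x)

print("energies:", [round(v, 3) for v in energies])
print("forces:  ", [round(v, 3) for v in forces])
```

The single high-energy step appears together with two opposite forces of equal magnitude on its flanking base-pairs, and the forces on all other base-pairs vanish.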

Directions of deformation and force 
are non-trivially related 

Force and torque vectors often do not point into 
the directions one would expect when picturing 
DNA as made up from some uniform isotropic 
elastic material. This non-intuitive feature is 
visible in Fig. [3] but occurs also in force analyses
of complex structures. 

For example, regarding the forces needed to 
produce the 80° turn in TBP (Fig. [4]), one may
have two non-exclusive naive expectations: two 
point forces could push the ends of the curved 
region at base-pairs 9 and 16 towards the cen- 
ter of the circle of curvature, compensated by 
a force pulling the center of the curved region 
away from this point; or distributed torques 
along the curved region could try to bend DNA 
into the major groove, their torque vectors 
pointing normal to the local plane of bend- 
ing, i.e. towards one of the backbones. The 
computed forces and torques prove both of 
these expectations wrong. Clearly all forces 
observed in the complex point roughly along 
the local helical axis, not perpendicular to it; 
and distributed torques do occur but point 
into the major groove, at right angles to the 
expected direction. Thus, the coupled mechanical
properties of DNA produce the observed
80° turn of the TATA box by an array
of torques as would result from pulling both
sugar-phosphate backbones into the respective
3' direction. Interestingly, in an MD simulation
of a TATA sequence without protein [35],
this mode of external 3'-pulling was observed to
produce a bent shape that mimics the bound
conformation of DNA in the complex.

Another example of this coupling is the 36° 
Roll of the base-pair step 9-10 of the lac repressor
complex. As can be seen in Fig. [5], it results
not from a bending torque but mainly from 
stretching and underwinding by the protein. 
In summary, the non-trivial nano-mechanical 
properties of DNA make it impossible to tell 
by eye what force and torque directions are re- 
quired for a particular shape. 
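
A two-coordinate toy model shows how such a mismatch between deformation and force direction arises from off-diagonal stiffness alone. The stiffness matrix below couples a "Roll" and a "Twist" coordinate; its entries are hypothetical numbers in arbitrary units, chosen only to show the effect, and are not taken from any of the parameter sets used in this work.

```python
def solve_2x2(K, f):
    """Solve K q = f for a 2x2 matrix K given as nested tuples."""
    (a, b), (c, d) = K
    det = a * d - b * c
    return ((d * f[0] - b * f[1]) / det, (-c * f[0] + a * f[1]) / det)

def apply_2x2(K, q):
    """Return the generalized force K q."""
    return (K[0][0] * q[0] + K[0][1] * q[1], K[1][0] * q[0] + K[1][1] * q[1])

# Hypothetical stiffness for (Roll, Twist) with a positive coupling term
K = ((1.0, 0.6),
     (0.6, 2.0))

# A pure underwinding torque (no Roll component) still produces positive Roll
roll, twist = solve_2x2(K, (0.0, -1.0))

# Conversely, a pure Roll deformation requires a torque with a Twist component
torque = apply_2x2(K, (1.0, 0.0))

print(f"underwinding torque -> Roll = {roll:.3f}, Twist = {twist:.3f}")
print(f"pure Roll deformation -> torque = {torque}")
```

In the full model the same mechanism operates in six coupled dimensions per step, and in addition tension is transported between steps, so the effect is correspondingly harder to anticipate by eye.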

Forces require DNA-protein contacts, 
but contacts do not always transmit 
force 

In the absence of long-range interactions, only 
base-pairs that are contacted by protein should 
ever experience external forces. The converse 
is not true: not every contact can be expected 
to actually transmit force. These requirements 
allow for a consistency check of a DNA nano- 
mechanics analysis by comparing calculated ex- 
ternal forces with observed contact points. For 
example, in the lac repressor, base-pairs 2, 3, 
21 and 22 are non-contacted, and indeed their
forces and torques are weak. The remaining 
magnitude gives an estimate of the error in 
force determination of about 10% of the peak 
force. This error estimate agrees with the stan- 
dard deviation of forces and torques across 
the NMR ensemble, cf. Fig. [5]. We conclude
that possible systematic errors in force deter- 
mination are smaller than the uncertainty due 
to limited structural precision. In contrast, 
the calculated forces and torques at the non- 
contacted end base-pairs 1 and 23 of the binding
site are about three times the error estimate.
Their non-zero forces point to a systematic
error, possibly due to the dissimilar
properties of internal and end base-pairs, see 
Methods. Base-pair 9 in the same complex is 
an example of a base-pair in close contact with 
protein residues which experiences forces indis- 
tinguishable from 0, showing that contacts do 
not imply local forcing. 

A similar situation can be found in the TBP 
repressor complex [30], Fig. [4]. Here the boundary
base-pairs 1 and 2 are contacted by protein
from a neighboring unit cell, not shown in
Fig. [4], and are therefore not expected to be
force-free. In contrast, base-pairs 3 to 7 represent
a stretch of suspended, non-contacted poly-G 
DNA and are expected to be force-free. Indeed 
their residual force and torque magnitudes are 
only about 20% of the thermal force scale; this 
margin can serve as an error estimate. 
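
The consistency check described in this section can be sketched as a small routine. The force magnitudes and contact flags below are made-up placeholder values, not results from any of the analyzed complexes, and the choice of the mean residual on non-contacted base-pairs as the error estimate is one simple option among several.

```python
def contact_force_check(force, contacted):
    """Compare force magnitudes with a contact mask.

    force:     force magnitude per base-pair (any consistent unit)
    contacted: True where the base-pair is contacted by protein
    Returns the error estimate relative to the peak force, and the indices of
    contacted base-pairs whose force does not exceed the residual level.
    """
    free = [abs(f) for f, c in zip(force, contacted) if not c]
    residual = sum(free) / len(free)          # mean force on non-contacted base-pairs
    peak = max(abs(f) for f in force)
    silent = [i for i, (f, c) in enumerate(zip(force, contacted))
              if c and abs(f) <= residual]
    return residual / peak, silent

# Placeholder data (pN), 0-based base-pair indices
force = [20.0, 25.0, 150.0, 240.0, 15.0, 180.0, 22.0, 18.0]
contacted = [False, False, True, True, True, True, False, False]

rel_error, silent_contacts = contact_force_check(force, contacted)
print(f"error estimate: {100 * rel_error:.0f}% of peak force")
print("contacted but force-free base-pairs:", silent_contacts)
```

Contacted base-pairs returned in the second list correspond to the case of contacts that do not transmit force.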

DNA is deformed by a combination of 
local forces and propagated tension 

Apart from local forces, tension from flanking 
DNA can deform a base-pair step, see Fig. [3]. A
well-known extreme example of tension propa- 
gation is the lac operon, where a tight loop of 
many base-pairs is held together by two copies 
of the lac repressor dimer [36, 37, 38, 39].
While forces are exerted only at the ends, the 
propagated tension deforms DNA along the 
loop. In protein-DNA complexes, the observed 
deformations of bound DNA are generally due 
to the combination of local external forces and 
torques and distant forces and torques, prop- 
agated as tension along the chain. This non- 
local part is always present when forces do not 
balance locally, but becomes most apparent 
when deformations occur without local forces. 
Considering base-pairs 7-9 in the I-PpoI complex,
Fig. [6], note that both steps 7-8 and 8-9
are stressed, as indicated by their high elastic
energy. However, base-pair 8 in the middle
experiences only weak external force. This motif
can be interpreted mechanically as follows:
The protein pulls base-pairs 7 and 9 apart by a
nearly antiparallel force pair, leaving base-pair
8 suspended freely in the middle. Interestingly,
the single-strand cuts performed by the functional
form of the I-PpoI complex occur exactly
at the pre-stretched base-pair step 8-9 and the
symmetry-related site 12-13.
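
The suspended base-pair motif can be reproduced in a one-dimensional toy chain by going in the forward direction, from given external forces to step tensions. The sketch assumes a chain of twelve base-pairs, a step stiffness of 10 k_B T/Å^2 and an antiparallel force pair of 5 k_B T/Å on base-pairs 7 and 9; these numbers and the function name are illustrative only.

```python
def tensions_from_forces(f):
    """Step tensions of a 1D chain in equilibrium under external forces f.

    Force balance on base-pair i reads f[i] = t[i-1] - t[i], where t[i] is the
    tension of the step between base-pairs i and i+1 (0-based) and the tension
    beyond the chain ends is zero. Hence t[i] = -(f[0] + ... + f[i]).
    """
    tensions, running = [], 0.0
    for fi in f[:-1]:
        running -= fi
        tensions.append(running)
    return tensions

K_STEP = 10.0                  # assumed step stiffness, k_B T / Å^2
F = 5.0                        # assumed force magnitude, k_B T / Å
forces = [0.0] * 12            # base-pairs 1..12 stored at indices 0..11
forces[6] = -F                 # base-pair 7 pulled in the negative direction
forces[8] = +F                 # base-pair 9 pulled in the positive direction

tensions = tensions_from_forces(forces)
energies = [t * t / (2.0 * K_STEP) for t in tensions]

for i, (t, e) in enumerate(zip(tensions, energies), start=1):
    print(f"step {i}-{i + 1}: tension {t:4.1f}, energy {e:4.2f}")
```

Steps 7-8 and 8-9 carry the same tension and elastic energy although base-pair 8 experiences no external force, which is the one-dimensional analogue of the motif described above.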

Conclusions 

In this article we have presented an analysis
of DNA nano-mechanics within a given complex
structure, a novel but natural way to think
about the interaction of DNA with proteins.
The free energy function of any coarse-
grained model allows calculation of the mean 
forces acting on the represented degrees of free- 
dom. Our basic idea is to infer these mean 
forces from structural data and use them to de- 
scribe the mechanical interaction of DNA with 
its environment. This gives an intuitive way to 
interpret the mechanics of DNA-protein bind- 
ing, augmenting the interpretation of molecular 
conformation. 

We have implemented this idea for the rigid 
base-pair model of DNA, using an efficient and 
compact matrix formalism to derive the forces 
and torques acting on base-pairs. This level 
of coarse-graining offers a good compromise between
resolution and reliability. In particular,
the available parameter sets summarize
comprehensive structure database analysis and 
large-scale MD simulation efforts. Our method
puts this large body of knowledge about DNA
elasticity to work in a computationally
inexpensive way. As we have demonstrated, the
results are robust with respect to differences
in crystallization and among conformers in an
NMR structure ensemble; force profiles using
different parameter sets converge after
pre-relaxation; and forces on non-contacted
base-pairs vanish within the estimated error
bounds. New, nonlinear or poly-nucleotide
potentials will lead to improved accuracy and
range of applicability of nano-mechanics analysis.

From a physical chemistry point of view, 
base-pair forces and torques are interesting in 
their own right since they describe the inter- 
molecular force balance. They are easy to in- 
terpret since they are local quantities. How- 
ever, they are not easy to deduce from a struc- 
ture 'by eye', since they depend non-locally 
on DNA deformation, by propagation of elas- 
tic tension. Thus while it only takes a crystal 
structure as input, DNA nano-mechanics anal- 
ysis can improve on pure conformation analy- 
sis by integrating prior knowledge about DNA 
elasticity. 

To predict indirect readout effects, i.e. se- 
quence specificity of proteins mediated by DNA 
elasticity and structure, it is desirable to calcu- 
late the elastic contributions to protein-DNA 
binding free energies, see e.g. [7]. Nano-
mechanics analysis is not directly useful for 
this purpose, since energies can be calculated 
directly from the deformations. However, 
good estimates for the total elastic energy of 
a protein-DNA complex require some elastic 
model of the protein. Comparison of predicted 
and structure-based base-pair forces appears a
good way to validate such coarse-grained pro- 
tein models. 

A related application of our method is to es- 
tablish a connection between multi-scale bio- 
molecular simulation and experimental struc- 
tural data. Common simulation schemes con- 
nect different levels of coarse-graining by force- 
matching [40, 41]. Simulated and structure-
based base-pair forces can be matched with 
little extra effort, suggesting a data-driven 
method for the rational design and validation of 
new coarse-grained protein models. Our anal- 
ysis also establishes a link between structural 
studies and biophysical force measurements on 
short DNA loops [42].

From a biochemistry point of view, interpretation
of structures in terms of interaction
forces leads to hypotheses about their
biological functioning. For instance, DNA-modifying
proteins such as nucleases inflict
strong DNA deformations; when trapped intermediate
states can be crystallized, their base-pair
interaction forces shed light on the reaction
mechanism, see the I-PpoI example in Results.
Furthermore, the strength of transmitted
force constitutes a measure to classify local
sites of interaction, as applied in a related ar- 
ticle on nucleosome nano-mechanics [33]. The 
strongest-force contacts play the most impor- 
tant role in enforcing the structural constraints 
implied by binding. Thus, mutation of a DNA 
base or a protein residue affecting a high-force 
contact site is expected to result in strong 
perturbation of the complex structure, while 
small-force contacts should only weakly affect 
the global structure of the complex. So when- 
ever the global DNA conformation on a scale 
of several base-pairs is relevant for biological 
function of a complex, high-force contact sites 
emerge as natural targets for mutation assays. 
We refer the reader to [33] for a first obser- 
vation of the effect of mutations on the force 
patterns. 

Figure supp-1: Tilt, Roll, Twist (left column) and Shift, Slide, Rise (right column) base-pair
parameters corresponding to the equilibrium shapes of homogeneous DNA shown in the rightmost
column of Fig. [3]. Tilt and Shift, solid line; Roll and Slide, dashed; Twist and Rise, dotted. At
the central base-pair step 0, all parameters except for the perturbed one are at their equilibrium
values in each row. One can see that unlike the case of a shearable rod with uncoupled
modes and isotropic bending [44], the excess twist is not constant. As a consequence, external 
forces acting in non-equilibrium shapes cannot be deduced directly from local excess base-pair 
parameter values. Due to DNA symmetry properties, the two halves of the chain with positive 
and negative bp are identical only in the Roll, Twist, Slide and Rise panels. In these panels, 
the excess base-pair step parameters exhibit (anti)-symmetric profiles. The elastic energy (not 
shown) also varies along the chain. 

Text supp-1 Base-pair step potentials in exponential coordinates

We parametrize a base-pair step conformation $g_{kk+1}$ by letting

$$g_{kk+1} = g_{\mathrm{eq},b_kb_{k+1}} \exp[q^i_{kk+1} X_i], \qquad \text{(supp-1)}$$

where exp is the matrix exponential and $g_{\mathrm{eq},b_kb_{k+1}}$ is the equilibrium base-pair step conformation.
That is, step conformations are given in exponential coordinates on the rigid motion group, based
at the point $g_{\mathrm{eq},b_kb_{k+1}}$ [10]. The harmonic step energy function can be written as

$$a_{bb'}(g_{kk+1}) = \tfrac{1}{2}\, q^i_{kk+1}\, S_{(b_kb_{k+1})ij}\, q^j_{kk+1}, \qquad \text{(supp-2)}$$

where S can be obtained from stiffness matrices given in base-pair step parameters by multiplying
with appropriate Jacobian matrices. In the coordinates introduced above, the external forces
have simple expressions for weakly deformed steps. One obtains to first order,

$$\mu_{(k)i} = S_{(b_{k-1}b_k)ij}\, q^j_{k-1k} - \left(A^T_{kk+1} S_{(b_kb_{k+1})}\right)_{ij} q^j_{kk+1} + O(q^2), \qquad \text{(supp-3)}$$

where $A_{kk+1} = \mathrm{Ad}(g_{\mathrm{eq},b_kb_{k+1}})$ and Ad denotes the adjoint representation of the group; for
details, see [10]. For base-pair steps that are deformed more strongly, it was necessary to
consider the corrections to this first order result. We therefore postulated the quadratic energy
Eq. supp-2 to be valid for finite extensions, and recovered $\mu_{(k)i}$ by using the Jacobian matrix
$J_{\exp}$ relating exponential coordinates to the left invariant frame; one then gets

$$\mu_{(k)i} = \left(J^{-T}_{\exp}(q_{k-1k})\, S_{(b_{k-1}b_k)}\right)_{ij} q^j_{k-1k} - \left(A^T_{kk+1}\, J^{-T}_{\exp}(q_{kk+1})\, S_{(b_kb_{k+1})}\right)_{ij} q^j_{kk+1}. \qquad \text{(supp-4)}$$

Here, the Jacobian is given by $J_{\exp}(q) = \int_0^1 \mathrm{Ad}(\exp(-s\, q^j X_j))\, ds$.
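
As an illustration of the parametrization in Eq. supp-1, the following sketch evaluates the matrix exponential on the rigid motion group in closed form and composes it with an equilibrium step. Several choices here are assumptions made for the example and are not taken from the text: the ordering of the six coordinates (three rotations followed by three translations), the use of the standard basis of se(3) for the $X_i$, and the idealized equilibrium step with 36° Twist and 3.4 Å Rise. The conventions actually used are those of the cited reference.

```python
import math

def se3_exp(q):
    """exp(q^i X_i) for q = (w1, w2, w3, v1, v2, v3) as a 4x4 homogeneous matrix.

    Assumed basis: X_1..X_3 generate rotations about x, y, z (radians),
    X_4..X_6 generate translations along x, y, z.
    """
    w, v = list(q[:3]), list(q[3:])
    theta = math.sqrt(sum(c * c for c in w))
    # W is the skew-symmetric matrix with W u = w x u
    W = [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]
    W2 = [[sum(W[i][k] * W[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    if theta < 1e-8:
        a, b, c = 1.0, 0.5, 1.0 / 6.0            # small-angle limits
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
        c = (theta - math.sin(theta)) / theta ** 3
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    # Rodrigues formula for the rotation block, and the matching translation map
    R = [[eye[i][j] + a * W[i][j] + b * W2[i][j] for j in range(3)] for i in range(3)]
    V = [[eye[i][j] + b * W[i][j] + c * W2[i][j] for j in range(3)] for i in range(3)]
    p = [sum(V[i][j] * v[j] for j in range(3)) for i in range(3)]
    return [R[0] + [p[0]], R[1] + [p[1]], R[2] + [p[2]], [0.0, 0.0, 0.0, 1.0]]

def matmul4(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Idealized equilibrium step (assumed): 36 degrees Twist about z, 3.4 Å Rise along z
g_eq = se3_exp((0.0, 0.0, math.radians(36.0), 0.0, 0.0, 3.4))

# A deformed step: 0.1 rad of rotation about y and 0.2 Å of extra translation along z
q_step = (0.0, 0.1, 0.0, 0.0, 0.0, 0.2)
g_step = matmul4(g_eq, se3_exp(q_step))

for row in g_step:
    print(["%8.4f" % entry for entry in row])
```

Setting all six coordinates to zero returns the equilibrium step itself, which is the sense in which the coordinates are based at the point $g_{\mathrm{eq},b_kb_{k+1}}$.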

Figure supp-2: The magnitude of base-pair forces and torques resulting from nano-mechanics
analysis depends on the range of allowed pre-relaxation $r_m$. Generally, wider allowed relaxation
ranges result in a reduction of global force and energy scale, scaling roughly as $r_m^{-1}$. In addition,
for the smaller values of $r_m$ the profile shapes also change. The first row of the figure shows energy
profiles for I-PpoI. The curves correspond to: no relaxation and $r_m$ = 0.15, 0.3, 0.6 Å, from top to
bottom in each panel. The value $r_m$ = 0.3 Å equals the reported atomic position uncertainty,
used in Fig. [6]. MP parameter set, left column; P parameter set, right column. The second row
shows the combined magnitude of external force and torque, computed from the energy $a_{\mathrm{ext}}$
associated with a force-torque pair:
$a_{\mathrm{ext}}(\mu_{(k)}) = \tfrac{1}{2}\, \mu_{(k)}^T \left(S_{(b_{k-1}b_k)} + A^T_{kk+1} S_{(b_kb_{k+1})} A_{kk+1}\right)^{-1} \mu_{(k)}$.
Values of $r_m$ and parameter sets as above.